111 research outputs found

Multiple Hypothesis Testing in Pattern Discovery

Author: Garriga Gemma C.
Hanhijärvi Sami
Puolamäki Kai
Publication date: 01/01/2009

The problem of multiple hypothesis testing arises when more than one hypothesis is tested simultaneously for statistical significance. This is a very common situation in many data mining applications. For instance, simultaneously assessing the significance of all frequent itemsets of a single dataset entails a host of hypotheses, one for each itemset. A multiple hypothesis testing method is needed to control the number of false positives (Type I errors). Our contribution in this paper is to extend the multiple hypothesis framework to be used with a generic data mining algorithm. We provide a method that provably controls the family-wise error rate (FWER, the probability of at least one false positive) in the strong sense. We evaluate the performance of our solution on both real and generated data. The results show that our method controls the FWER while maintaining the power of the test. Comment: 28 pages
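
To make the notion of FWER control concrete, the following is a minimal sketch of Holm's step-down procedure, a standard textbook method that controls the FWER in the strong sense. It is not the method proposed in the paper above; the function name `holm_reject` and the example p-values are illustrative assumptions.

```python
from typing import List


def holm_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Holm's step-down procedure (illustrative, not the paper's method).

    Returns one flag per hypothesis: True if it is rejected while keeping
    the family-wise error rate at or below alpha.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The (rank + 1)-th smallest p-value is compared with alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            # Stop at the first failure; all remaining hypotheses are retained.
            break
    return reject


# Hypothetical p-values, e.g. one per candidate pattern.
print(holm_reject([0.001, 0.04, 0.03, 0.2]))  # [True, False, False, False]
```

With four hypotheses and alpha = 0.05, only the smallest p-value passes its threshold (0.05 / 4), so a single pattern is declared significant.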

arXiv.org e-Print Archive

Aaltodoc Publication Archive

SIDE : a web app for interactive visual data exploration with subjective feedback

Author: De Bie Tijl
Kang Bo
Lijffijt Jefrey
Puolamäki Kai
Publication date: 01/01/2016

Ghent University Academic Bibliography

Breaking of R-parity and supersymmetry in supersymmetric models

Author: Puolamäki Kai
Publication venue: Helsingfors universitet
Publication date: 01/01/2001

Helsingin yliopiston digitaalinen arkisto

Inferring Intent and Action from Gaze in Naturalistic Behavior : A Review

Author: Lukander Kristian
Puolamäki Kai
Toivanen Miika
Publication date: 01/10/2017

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

SLISEMAP: Explainable Dimensionality Reduction

Author: Björklund Anton
Mäkelä Jarmo
Puolamäki Kai
Publication date: 01/01/2022

Existing explanation methods for black-box supervised learning models generally work by building local models that explain the model's behaviour for a particular data item. It is possible to make global explanations, but the explanations may have low fidelity for complex models. Most of the prior work on explainable models has focused on classification problems, with less attention paid to regression. We propose a new manifold visualization method, SLISEMAP, that simultaneously finds local explanations for all of the data items and builds a two-dimensional visualization of model space such that the data items explained by the same model are projected nearby. We provide an open source implementation of our methods, built on the GPU-optimized PyTorch library. SLISEMAP works on both classification and regression models. We compare SLISEMAP to the most popular dimensionality reduction methods and to some local explanation methods. We provide a mathematical derivation of our problem and show that SLISEMAP provides fast and stable visualizations that can be used to explain and understand black box regression and classification models.

Helsingin yliopiston digitaalinen arkisto

SLISEMAP: supervised dimensionality reduction through local explanations

Author: Björklund Anton
Mäkelä Jarmo
Puolamäki Kai
Publication date: 18/05/2022

Existing methods for explaining black box learning models often focus on building local explanations of the models’ behaviour for particular data items. It is possible to create global explanations for all data items, but these explanations generally have low fidelity for complex black box models. We propose a new supervised manifold visualisation method, SLISEMAP, that simultaneously finds local explanations for all data items and builds a (typically) two-dimensional global visualisation of the black box model such that data items with similar local explanations are projected nearby. We provide a mathematical derivation of our problem and an open source implementation built on the GPU-optimised PyTorch library. We compare SLISEMAP to multiple popular dimensionality reduction methods and find that SLISEMAP is able to utilise labelled data to create embeddings with consistent local white box models. We also compare SLISEMAP to other model-agnostic local explanation methods and show that SLISEMAP provides comparable explanations and that the visualisations can give a broader understanding of black box regression and classification models. Peer reviewed

arXiv.org e-Print Archive

Helsingin yliopiston digitaalinen arkisto